Economic Time Series Forecasting
This post explores a test of the modeltime package for time-series forecasting, combining machine-learning algorithms.
Shew the things that are to come hereafter, that we may know that ye are gods: yea, do good, or do evil, that we may be dismayed, and behold it together. Isaiah 41:23

I load the required libraries:
library(xgboost)
library(earth)
library(tidymodels)
library(modeltime)
library(modeltime.ensemble)
library(tidyverse)
library(lubridate)
library(timetk)
library(quantmod)
library(tsibble)
library(tsibbledata)
library(dplyr)
library(data.table)
library(tidyr)
Once again I use the same data, quotes for the CORN corn fund traded on the Chicago exchange:
CORN <- getSymbols("CORN", auto.assign = FALSE,
from = "1994-01-01", to = Sys.Date())
glimpse(CORN)
An 'xts' object on 2010-06-09/2021-06-18 containing:
Data: num [1:2777, 1:6] 25.1 25.5 25.9 26 26.2 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:6] "CORN.Open" "CORN.High" "CORN.Low" "CORN.Close" ...
Indexed by objects of class: [Date] TZ: UTC
xts Attributes:
List of 2
$ src : chr "yahoo"
$ updated: POSIXct[1:1], format: "2021-06-20 14:24:05"
periodicity(CORN)
Daily periodicity from 2010-06-09 to 2021-06-18
CORN <- CORN %>%
as.data.table() %>%
as_tibble()
class(CORN)
[1] "tbl_df" "tbl" "data.frame"
glimpse(CORN)
Rows: 2,777
Columns: 7
$ index <date> 2010-06-09, 2010-06-10, 2010-06-11, 2010-06-14, 2010-06~
$ CORN.Open <dbl> 25.12, 25.46, 25.88, 25.99, 26.24, 26.26, 26.20, 26.22, ~
$ CORN.High <dbl> 25.25, 25.46, 25.88, 26.11, 26.24, 26.44, 26.20, 26.60, ~
$ CORN.Low <dbl> 25.12, 25.46, 25.79, 25.99, 25.97, 26.20, 25.82, 26.22, ~
$ CORN.Close <dbl> 25.15, 25.46, 25.79, 26.11, 25.97, 26.32, 26.08, 26.39, ~
$ CORN.Volume <dbl> 1700, 200, 500, 2200, 7000, 2400, 1600, 3100, 9500, 2870~
$ CORN.Adjusted <dbl> 25.15, 25.46, 25.79, 26.11, 25.97, 26.32, 26.08, 26.39, ~
CORN %>%
plot_time_series(index, CORN.Close, .interactive = FALSE)

splits <- initial_time_split(CORN, prop = 0.8)
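A quick sanity check on the split (a sketch; the row counts follow from the 2,777-row tibble shown above):

```r
# initial_time_split() keeps the FIRST 80% of rows for training and reserves
# the most recent 20% for testing -- no shuffling, since order matters here
nrow(training(splits))  # 2221
nrow(testing(splits))   # 556, matching the <tibble [556 x 4]>
                        # calibration data shown further below
```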
The models tested here number 14, covering:
the ARIMA family, with and without XGBoost-modelled errors;
exponential smoothing, with seasonal adjustments, plus the Croston and Theta models;
TBATS;
prophet and prophet_xgboost;
Naive and SNaive;
an artificial neural network with an autoregressive component.
# Model 1: auto_arima ----
model_fit_arima_no_boost <- arima_reg() %>%
set_engine(engine = "auto_arima") %>%
fit(CORN.Close ~ index,
data = training(splits))
# Model 2: arima_boost ----
model_fit_arima_boosted <- arima_boost(
min_n = 2,
learn_rate = 0.9
) %>%
set_engine(engine = "auto_arima_xgboost") %>%
fit(CORN.Close ~ index,
data = training(splits))
# Model 3: ets ----
model_fit_ets <- exp_smoothing() %>%
set_engine(engine = "ets") %>%
fit(CORN.Close ~ index,
data = training(splits))
# Model 4: tbats ----
model_fit_tbats <- seasonal_reg() %>%
set_engine(engine = "tbats") %>%
fit(CORN.Close ~ index,
data = training(splits))
# Model 5: prophet xgb ----
model_fit_prophet_xb <- prophet_boost(
learn_rate = 0.8 ) %>%
set_engine("prophet_xgboost") %>%
fit(CORN.Close ~ index,
data = training(splits))
# Model 6: stlm_ets ----
model_fit_stlm_ets <- seasonal_reg() %>%
set_engine("stlm_ets") %>%
fit(CORN.Close ~ index,
data = training(splits))
# Model 7: stlm arima ----
model_fit_stlm_arima <- seasonal_reg() %>%
set_engine("stlm_arima") %>%
fit(CORN.Close ~ index,
data = training(splits))
# Model 8: arima xgb with preset orders ----
model_fit_arima_padrao <- arima_boost(
# Preset p, d, q orders
seasonal_period = 7,
non_seasonal_ar = 5,
non_seasonal_differences = 1,
non_seasonal_ma = 3,
seasonal_ar = 1,
seasonal_differences = 0,
seasonal_ma = 1,
# XGBoost
tree_depth = 6,
learn_rate = 0.8
) %>%
set_engine("arima_xgboost") %>%
fit(CORN.Close ~ index,
data = training(splits))
# Model 9: ets with preset parameters ----
model_fit_ets_padrao <- exp_smoothing(
seasonal_period = 7, # 7-day trading pattern
error = "multiplicative",
trend = "additive",
season = "multiplicative"
) %>%
set_engine("ets") %>%
fit(CORN.Close ~ index,
data = training(splits))
# Model 10: ets croston ----
model_fit_ets_croston <- exp_smoothing(
smooth_level = 0.2
) %>%
set_engine("croston") %>%
fit(CORN.Close ~ index,
data = training(splits))
# Model 11: ets theta ----
model_fit_ets_theta <- exp_smoothing(
) %>%
set_engine("theta") %>%
fit(CORN.Close ~ index,
data = training(splits))
# Model 12: naive ----
model_fit_naive <- naive_reg(
) %>%
set_engine("naive") %>%
fit(CORN.Close ~ index,
data = training(splits))
# Model 13: snaive ----
model_fit_snaive <- naive_reg(
seasonal_period = 7
) %>%
set_engine("snaive") %>%
fit(CORN.Close ~ index,
data = training(splits))
# Model 14: autoregressive neural network ----
model_fit_nnetar <- nnetar_reg(
) %>%
set_engine("nnetar")
set.seed(123)
model_fit_nnetar <- model_fit_nnetar %>%
fit(CORN.Close ~ index,
data = training(splits))
I evaluate each method's fit on the same sample of the time series:
models_tbl <- modeltime_table(
# -- Model -- -- Number --
model_fit_arima_no_boost, # Model 1
model_fit_arima_boosted, # Model 2
model_fit_ets, # Model 3
model_fit_tbats, # Model 4
model_fit_prophet_xb, # Model 5
model_fit_stlm_ets, # Model 6
model_fit_stlm_arima, # Model 7
model_fit_arima_padrao, # Model 8
model_fit_ets_padrao, # Model 9
model_fit_ets_croston, # Model 10
model_fit_ets_theta, # Model 11
model_fit_naive, # Model 12
model_fit_snaive, # Model 13
model_fit_nnetar # Model 14
)
models_tbl
# Modeltime Table
# A tibble: 14 x 3
.model_id .model .model_desc
<int> <list> <chr>
1 1 <fit[+]> ARIMA(2,1,0)
2 2 <fit[+]> ARIMA(0,1,2)
3 3 <fit[+]> ETS(M,A,N)
4 4 <fit[+]> BATS(0, {0,0}, -, -)
5 5 <fit[+]> PROPHET
6 6 <fit[+]> SEASONAL DECOMP: ETS(M,A,N)
7 7 <fit[+]> SEASONAL DECOMP: ARIMA(0,1,0)
8 8 <fit[+]> ARIMA(5,1,3)(1,0,1)[7]
9 9 <fit[+]> ETS(M,AD,M)
10 10 <fit[+]> CROSTON METHOD
11 11 <fit[+]> THETA METHOD
12 12 <fit[+]> NAIVE
13 13 <fit[+]> SNAIVE [7]
14 14 <fit[+]> NNAR(1,1,10)[5]
models_tbl %>%
modeltime_calibrate(new_data = testing(splits)) %>%
modeltime_accuracy(
metric_set = metric_set(
mape,
smape,
mae,
rmse,
rsq # R^2
)
)
# A tibble: 14 x 8
.model_id .model_desc .type mape smape mae rmse rsq
<int> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 ARIMA(2,1,0) Test 13.4 12.8 1.95 2.43 2.62e-5
2 2 ARIMA(0,1,2) Test 13.4 12.8 1.95 2.43 5.07e-5
3 3 ETS(M,A,N) Test 12.5 14.4 2.12 3.43 7.60e-2
4 4 BATS(0, {0,0}, -, -) Test 13.4 12.8 1.95 2.43 NA
5 5 PROPHET Test 12.4 13.4 2.01 2.94 3.44e-2
6 6 SEASONAL DECOMP: ETS(M,A,~ Test 12.6 14.5 2.13 3.45 7.60e-2
7 7 SEASONAL DECOMP: ARIMA(0,~ Test 13.3 12.7 1.94 2.43 3.34e-5
8 8 ARIMA(5,1,3)(1,0,1)[7] Test 13.4 12.8 1.95 2.43 9.26e-5
9 9 ETS(M,AD,M) Test 13.2 12.7 1.94 2.42 1.36e-3
10 10 CROSTON METHOD Test 13.5 12.9 1.96 2.44 NA
11 11 THETA METHOD Test 12.0 13.6 2.01 3.25 7.60e-2
12 12 NAIVE Test 13.4 12.8 1.95 2.43 NA
13 13 SNAIVE [7] Test 13.4 12.8 1.95 2.44 2.93e-6
14 14 NNAR(1,1,10)[5] Test 17.2 15.5 2.39 2.89 2.14e-3
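For reference, the headline metrics can be reproduced by hand on a toy vector (a sketch; `y` and `yhat` are made-up numbers, not the CORN series):

```r
# MAPE and RMSE as computed by yardstick's mape() and rmse():
# mape = 100 * mean(|y - yhat| / |y|) ; rmse = sqrt(mean((y - yhat)^2))
y    <- c(10, 12, 11, 13)   # hypothetical actuals
yhat <- c(11, 11, 12, 12)   # hypothetical forecasts
100 * mean(abs(y - yhat) / abs(y))  # MAPE ~ 8.78
sqrt(mean((y - yhat)^2))            # RMSE = 1
```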
Let's inspect the prophet model, for example:
list(model_fit_prophet_xb) %>%
as_modeltime_table()
# Modeltime Table
# A tibble: 1 x 3
.model_id .model .model_desc
<int> <list> <chr>
1 1 <fit[+]> PROPHET
calibration_tbl <- models_tbl %>%
modeltime_calibrate(new_data = testing(splits), quiet = FALSE)
calibration_tbl
# Modeltime Table
# A tibble: 14 x 5
.model_id .model .model_desc .type .calibration_data
<int> <list> <chr> <chr> <list>
1 1 <fit[+]> ARIMA(2,1,0) Test <tibble [556 x 4]>
2 2 <fit[+]> ARIMA(0,1,2) Test <tibble [556 x 4]>
3 3 <fit[+]> ETS(M,A,N) Test <tibble [556 x 4]>
4 4 <fit[+]> BATS(0, {0,0}, -, -) Test <tibble [556 x 4]>
5 5 <fit[+]> PROPHET Test <tibble [556 x 4]>
6 6 <fit[+]> SEASONAL DECOMP: ETS(M,A,N) Test <tibble [556 x 4]>
7 7 <fit[+]> SEASONAL DECOMP: ARIMA(0,1,0) Test <tibble [556 x 4]>
8 8 <fit[+]> ARIMA(5,1,3)(1,0,1)[7] Test <tibble [556 x 4]>
9 9 <fit[+]> ETS(M,AD,M) Test <tibble [556 x 4]>
10 10 <fit[+]> CROSTON METHOD Test <tibble [556 x 4]>
11 11 <fit[+]> THETA METHOD Test <tibble [556 x 4]>
12 12 <fit[+]> NAIVE Test <tibble [556 x 4]>
13 13 <fit[+]> SNAIVE [7] Test <tibble [556 x 4]>
14 14 <fit[+]> NNAR(1,1,10)[5] Test <tibble [556 x 4]>
calibration_tbl %>%
modeltime_forecast(
new_data = testing(splits),
actual_data = CORN
) %>%
plot_modeltime_forecast(
.legend_max_width = 25, # For mobile screens
.interactive = TRUE
)
The ensemble is a mixed model whose weights are set according to the accuracy of each of the 14 univariate models estimated so far.
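The loadings below are picked by hand. As an alternative sketch (not the author's method), they could be derived from out-of-sample accuracy, e.g. inverse test RMSE, reusing `models_tbl` and `splits` from above:

```r
# Derive ensemble loadings from out-of-sample RMSE instead of hand-picking:
# lower RMSE -> larger weight; ensemble_weighted() rescales them anyway.
acc_tbl <- models_tbl %>%
  modeltime_calibrate(new_data = testing(splits)) %>%
  modeltime_accuracy()

inv_rmse      <- 1 / acc_tbl$rmse        # inverse-error weighting
loadings_auto <- inv_rmse / sum(inv_rmse)

model_fit_ensemble_auto <- models_tbl %>%
  ensemble_weighted(loadings = loadings_auto, scale_loadings = TRUE)
```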
# Model 15: ensemble ----
model_fit_ensemble <- models_tbl %>%
ensemble_weighted(
loadings = c(
#-- Weight -- # -- Model --
1, # Model 1
2, # Model 2
5, # Model 3
6, # Model 4
10, # Model 5
9, # Model 6
8, # Model 7
1, # Model 8
7, # Model 9
2, # Model 10
2, # Model 11
2, # Model 12
1, # Model 13
10 # Model 14
),
scale_loadings = TRUE
)
model_fit_ensemble
-- Modeltime Ensemble -------------------------------------------
Ensemble of 14 Models (WEIGHTED)
# Modeltime Table
# A tibble: 14 x 4
.model_id .model .model_desc .loadings
<int> <list> <chr> <dbl>
1 1 <fit[+]> ARIMA(2,1,0) 0.0152
2 2 <fit[+]> ARIMA(0,1,2) 0.0303
3 3 <fit[+]> ETS(M,A,N) 0.0758
4 4 <fit[+]> BATS(0, {0,0}, -, -) 0.0909
5 5 <fit[+]> PROPHET 0.152
6 6 <fit[+]> SEASONAL DECOMP: ETS(M,A,N) 0.136
7 7 <fit[+]> SEASONAL DECOMP: ARIMA(0,1,0) 0.121
8 8 <fit[+]> ARIMA(5,1,3)(1,0,1)[7] 0.0152
9 9 <fit[+]> ETS(M,AD,M) 0.106
10 10 <fit[+]> CROSTON METHOD 0.0303
11 11 <fit[+]> THETA METHOD 0.0303
12 12 <fit[+]> NAIVE 0.0303
13 13 <fit[+]> SNAIVE [7] 0.0152
14 14 <fit[+]> NNAR(1,1,10)[5] 0.152
We build the projection with the ensemble model:
# Calibration
calibration_tbl <- modeltime_table(
model_fit_ensemble
) %>%
modeltime_calibrate(testing(splits), quiet = FALSE)
# Forecast against the test sample (original data from the train/test cutoff onward)
calibration_tbl %>%
modeltime_forecast(
new_data = testing(splits),
actual_data = CORN
) %>%
plot_modeltime_forecast(.interactive = TRUE)
Resampling a time series is an important strategy for assessing model stability over time. Doing it by hand is somewhat painful, however, since it requires several for loops to generate predictions for multiple models and, potentially, multiple groups of time series.
resamples_tscv <- training(splits) %>%
time_series_cv(
assess = "2 years",
initial = "5 years",
skip = "1 years",
# Normally we do more than one slice, but this speeds up the example
slice_limit = 1
)
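Before fitting, the single slice produced above can be inspected visually (a sketch using timetk helpers that ship with the same toolchain):

```r
# Visualize the train/assessment windows of the cross-validation plan;
# tk_time_series_cv_plan() flattens the rsample object into a plain tibble.
resamples_tscv %>%
  tk_time_series_cv_plan() %>%
  plot_time_series_cv_plan(index, CORN.Close, .interactive = FALSE)
```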
# Fit and generate resample predictions
model_fit_ensemble_resample <- calibration_tbl %>%
modeltime_fit_resamples(
resamples = resamples_tscv,
control = control_resamples(verbose = TRUE)
)
-- Fitting Resamples --------------------------------------------
0.22 sec elapsed
# A new data frame is created from the Modeltime Table
# with a column labeled, '.resample_results'
model_fit_ensemble_resample
# Modeltime Table
# A tibble: 1 x 6
.model_id .model .model_desc .type .calibration_da~ .resample_resul~
<int> <list> <chr> <chr> <list> <list>
1 1 <ensembl~ ENSEMBLE (WEIGHTE~ Test <tibble [556 x ~ <lgl [1]>
Plot the chart:
model_fit_ensemble_resample %>%
plot_modeltime_resamples(.interactive = TRUE)
Build the out-of-sample projection, from the last observation, for the next 24 months:
refit_tbl <- calibration_tbl %>%
modeltime_refit(data = CORN)
refit_tbl %>%
modeltime_forecast(h = "2 years", actual_data = CORN) %>%
plot_modeltime_forecast(
.legend_max_width = 25, # For mobile screens
.interactive = TRUE
)
In the bivariate form we add trading volume as an explanatory variable. Even though we expect worse statistical performance, economic reasoning suggests that volatility is readily captured by watching the spikes in daily traded volume.
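That claim can be eyeballed before refitting (a sketch reusing the timetk plotting helper from earlier):

```r
# Plot daily traded volume over time; its spikes tend to line up with the
# most volatile stretches of the closing-price series plotted earlier.
CORN %>%
  plot_time_series(index, CORN.Volume, .interactive = FALSE)
```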
# Model 1: auto_arima ----
model_fit_arima_no_boost <- arima_reg() %>%
set_engine(engine = "auto_arima") %>%
fit(CORN.Close ~ CORN.Volume + index,
data = training(splits))
# Model 2: arima_boost ----
model_fit_arima_boosted <- arima_boost(
min_n = 2,
learn_rate = 0.9
) %>%
set_engine(engine = "auto_arima_xgboost") %>%
fit(CORN.Close ~ CORN.Volume + index,
data = training(splits))
# Model 3: ets ----
model_fit_ets <- exp_smoothing() %>%
set_engine(engine = "ets") %>%
fit(CORN.Close ~ CORN.Volume + index,
data = training(splits))
# Model 4: tbats ----
model_fit_tbats <- seasonal_reg() %>%
set_engine(engine = "tbats") %>%
fit(CORN.Close ~ CORN.Volume + index,
data = training(splits))
# Model 5: prophet xgb ----
model_fit_prophet_xb <- prophet_boost(
learn_rate = 0.8 ) %>%
set_engine("prophet_xgboost") %>%
fit(CORN.Close ~ CORN.Volume + index,
data = training(splits))
# Model 6: stlm_ets ----
model_fit_stlm_ets <- seasonal_reg() %>%
set_engine("stlm_ets") %>%
fit(CORN.Close ~ CORN.Volume + index,
data = training(splits))
# Model 7: stlm arima ----
model_fit_stlm_arima <- seasonal_reg() %>%
set_engine("stlm_arima") %>%
fit(CORN.Close ~ CORN.Volume + index,
data = training(splits))
# Model 8: arima xgb with preset orders ----
model_fit_arima_padrao <- arima_boost(
# Preset p, d, q orders
seasonal_period = 7,
non_seasonal_ar = 5,
non_seasonal_differences = 1,
non_seasonal_ma = 3,
seasonal_ar = 1,
seasonal_differences = 0,
seasonal_ma = 1,
# XGBoost
tree_depth = 6,
learn_rate = 0.8
) %>%
set_engine("arima_xgboost") %>%
fit(CORN.Close ~ CORN.Volume + index,
data = training(splits))
# Model 9: ets with preset parameters ----
model_fit_ets_padrao <- exp_smoothing(
seasonal_period = 7, # 7-day trading pattern
error = "multiplicative",
trend = "additive",
season = "multiplicative"
) %>%
set_engine("ets") %>%
fit(CORN.Close ~ CORN.Volume + index,
data = training(splits))
# Model 10: ets croston ----
model_fit_ets_croston <- exp_smoothing(
smooth_level = 0.2
) %>%
set_engine("croston") %>%
fit(CORN.Close ~ CORN.Volume + index,
data = training(splits))
# Model 11: ets theta ----
model_fit_ets_theta <- exp_smoothing(
) %>%
set_engine("theta") %>%
fit(CORN.Close ~ CORN.Volume + index,
data = training(splits))
# Model 12: naive ----
model_fit_naive <- naive_reg(
) %>%
set_engine("naive") %>%
fit(CORN.Close ~ CORN.Volume + index,
data = training(splits))
# Model 13: snaive ----
model_fit_snaive <- naive_reg(
seasonal_period = 7
) %>%
set_engine("snaive") %>%
fit(CORN.Close ~ CORN.Volume + index,
data = training(splits))
# Model 14: autoregressive neural network ----
model_fit_nnetar <- nnetar_reg(
) %>%
set_engine("nnetar")
set.seed(123)
model_fit_nnetar <- model_fit_nnetar %>%
fit(CORN.Close ~ CORN.Volume + index,
data = training(splits))
I evaluate each method's fit on the same sample of the time series:
models_tbl_bivariada <- modeltime_table(
# -- Model -- -- Number --
model_fit_arima_no_boost, # Model 1
model_fit_arima_boosted, # Model 2
model_fit_ets, # Model 3
model_fit_tbats, # Model 4
model_fit_prophet_xb, # Model 5
model_fit_stlm_ets, # Model 6
model_fit_stlm_arima, # Model 7
model_fit_arima_padrao, # Model 8
model_fit_ets_padrao, # Model 9
model_fit_ets_croston, # Model 10
model_fit_ets_theta, # Model 11
model_fit_naive, # Model 12
model_fit_snaive, # Model 13
model_fit_nnetar # Model 14
)
models_tbl_bivariada
# Modeltime Table
# A tibble: 14 x 3
.model_id .model .model_desc
<int> <list> <chr>
1 1 <fit[+]> REGRESSION WITH ARIMA(2,1,0) ERRORS
2 2 <fit[+]> ARIMA(0,1,2) W/ XGBOOST ERRORS
3 3 <fit[+]> ETS(M,A,N)
4 4 <fit[+]> BATS(0, {0,0}, -, -)
5 5 <fit[+]> PROPHET
6 6 <fit[+]> SEASONAL DECOMP: ETS(M,A,N)
7 7 <fit[+]> SEASONAL DECOMP: REGRESSION WITH ARIMA(0,1,0) ERRORS
8 8 <fit[+]> ARIMA(5,1,3)(1,0,1)[7] W/ XGBOOST ERRORS
9 9 <fit[+]> ETS(M,AD,M)
10 10 <fit[+]> CROSTON METHOD
11 11 <fit[+]> THETA METHOD
12 12 <fit[+]> NAIVE
13 13 <fit[+]> SNAIVE [7]
14 14 <fit[+]> NNAR(1,1,10)[5]
Accuracy table for the bivariate models over time:
models_tbl_bivariada %>%
modeltime_calibrate(new_data = testing(splits)) %>%
modeltime_accuracy(
metric_set = metric_set(
mape,
smape,
mae,
rmse,
rsq # R^2
)
)
# A tibble: 14 x 8
.model_id .model_desc .type mape smape mae rmse rsq
<int> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 REGRESSION WITH ARIMA(2,1,~ Test 11.2 12.0 1.82 2.79 7.92e-2
2 2 ARIMA(0,1,2) W/ XGBOOST ER~ Test 13.5 12.9 1.97 2.49 2.33e-3
3 3 ETS(M,A,N) Test 12.5 14.4 2.12 3.43 7.60e-2
4 4 BATS(0, {0,0}, -, -) Test 13.4 12.8 1.95 2.43 NA
5 5 PROPHET Test 12.4 13.4 2.01 2.94 3.44e-2
6 6 SEASONAL DECOMP: ETS(M,A,N) Test 12.6 14.5 2.13 3.45 7.60e-2
7 7 SEASONAL DECOMP: REGRESSIO~ Test 13.3 12.7 1.94 2.43 3.68e-2
8 8 ARIMA(5,1,3)(1,0,1)[7] W/ ~ Test 13.5 12.9 1.97 2.49 9.83e-4
9 9 ETS(M,AD,M) Test 13.2 12.7 1.94 2.42 1.36e-3
10 10 CROSTON METHOD Test 13.5 12.9 1.96 2.44 NA
11 11 THETA METHOD Test 12.0 13.6 2.01 3.25 7.60e-2
12 12 NAIVE Test 13.4 12.8 1.95 2.43 NA
13 13 SNAIVE [7] Test 13.4 12.8 1.95 2.44 2.93e-6
14 14 NNAR(1,1,10)[5] Test 82.6 47.9 12.1 16.5 5.55e-2
Let's inspect the prophet model, for example:
list(model_fit_prophet_xb) %>%
as_modeltime_table()
# Modeltime Table
# A tibble: 1 x 3
.model_id .model .model_desc
<int> <list> <chr>
1 1 <fit[+]> PROPHET
calibration_tbl_bivariada <- models_tbl_bivariada %>%
modeltime_calibrate(new_data = testing(splits), quiet = FALSE)
calibration_tbl_bivariada
# Modeltime Table
# A tibble: 14 x 5
.model_id .model .model_desc .type .calibration_data
<int> <list> <chr> <chr> <list>
1 1 <fit[+]> REGRESSION WITH ARIMA(2,1,0) ERRORS Test <tibble [556 x 4]>
2 2 <fit[+]> ARIMA(0,1,2) W/ XGBOOST ERRORS Test <tibble [556 x 4]>
3 3 <fit[+]> ETS(M,A,N) Test <tibble [556 x 4]>
4 4 <fit[+]> BATS(0, {0,0}, -, -) Test <tibble [556 x 4]>
5 5 <fit[+]> PROPHET Test <tibble [556 x 4]>
6 6 <fit[+]> SEASONAL DECOMP: ETS(M,A,N) Test <tibble [556 x 4]>
7 7 <fit[+]> SEASONAL DECOMP: REGRESSION WITH ARIMA(0,1,0)~ Test <tibble [556 x 4]>
8 8 <fit[+]> ARIMA(5,1,3)(1,0,1)[7] W/ XGBOOST ERRORS Test <tibble [556 x 4]>
9 9 <fit[+]> ETS(M,AD,M) Test <tibble [556 x 4]>
10 10 <fit[+]> CROSTON METHOD Test <tibble [556 x 4]>
11 11 <fit[+]> THETA METHOD Test <tibble [556 x 4]>
12 12 <fit[+]> NAIVE Test <tibble [556 x 4]>
13 13 <fit[+]> SNAIVE [7] Test <tibble [556 x 4]>
14 14 <fit[+]> NNAR(1,1,10)[5] Test <tibble [556 x 4]>
calibration_tbl_bivariada %>%
modeltime_forecast(
new_data = testing(splits),
actual_data = CORN
) %>%
plot_modeltime_forecast(
.legend_max_width = 25, # For mobile screens
.interactive = TRUE
)
The ensemble is a mixed model whose weights are set according to the accuracy of each of the 14 bivariate models estimated so far.
# Model 15: ensemble ----
model_fit_ensemble_bivariada <- models_tbl_bivariada %>%
ensemble_weighted(
loadings = c(
#-- Weight -- # -- Model --
1, # Model 1
10, # Model 2
5, # Model 3
6, # Model 4
10, # Model 5
9, # Model 6
8, # Model 7
10, # Model 8
7, # Model 9
2, # Model 10
2, # Model 11
2, # Model 12
1, # Model 13
6 # Model 14
),
scale_loadings = TRUE
)
model_fit_ensemble_bivariada
-- Modeltime Ensemble -------------------------------------------
Ensemble of 14 Models (WEIGHTED)
# Modeltime Table
# A tibble: 14 x 4
.model_id .model .model_desc .loadings
<int> <list> <chr> <dbl>
1 1 <fit[+]> REGRESSION WITH ARIMA(2,1,0) ERRORS 0.0127
2 2 <fit[+]> ARIMA(0,1,2) W/ XGBOOST ERRORS 0.127
3 3 <fit[+]> ETS(M,A,N) 0.0633
4 4 <fit[+]> BATS(0, {0,0}, -, -) 0.0759
5 5 <fit[+]> PROPHET 0.127
6 6 <fit[+]> SEASONAL DECOMP: ETS(M,A,N) 0.114
7 7 <fit[+]> SEASONAL DECOMP: REGRESSION WITH ARIMA(0,1,0) E~ 0.101
8 8 <fit[+]> ARIMA(5,1,3)(1,0,1)[7] W/ XGBOOST ERRORS 0.127
9 9 <fit[+]> ETS(M,AD,M) 0.0886
10 10 <fit[+]> CROSTON METHOD 0.0253
11 11 <fit[+]> THETA METHOD 0.0253
12 12 <fit[+]> NAIVE 0.0253
13 13 <fit[+]> SNAIVE [7] 0.0127
14 14 <fit[+]> NNAR(1,1,10)[5] 0.0759
We build the projection with the ensemble model:
# Calibration
calibration_tbl_bivariada <- modeltime_table(
model_fit_ensemble_bivariada
) %>%
modeltime_calibrate(testing(splits), quiet = FALSE)
# Forecast against the test sample (original data from the train/test cutoff onward)
calibration_tbl_bivariada %>%
modeltime_forecast(
new_data = testing(splits),
actual_data = CORN
) %>%
plot_modeltime_forecast(.interactive = TRUE)
Resampling a time series is an important strategy for assessing model stability over time. Doing it by hand is somewhat painful, however, since it requires several for loops to generate predictions for multiple models and, potentially, multiple groups of time series.
resamples_tscv_bivariada <- training(splits) %>%
time_series_cv(
assess = "7 years",
initial = "1 years",
skip = "1 years",
# Normally we do more than one slice, but this speeds up the example
slice_limit = 1
)
# Fit and generate resample predictions
model_fit_ensemble_resample_bivariada <- calibration_tbl_bivariada %>%
modeltime_fit_resamples(
resamples = resamples_tscv_bivariada,
control = control_resamples(verbose = TRUE)
)
-- Fitting Resamples --------------------------------------------
0.01 sec elapsed
# A new data frame is created from the Modeltime Table
# with a column labeled, '.resample_results'
model_fit_ensemble_resample_bivariada
# Modeltime Table
# A tibble: 1 x 6
.model_id .model .model_desc .type .calibration_da~ .resample_resul~
<int> <list> <chr> <chr> <list> <list>
1 1 <ensembl~ ENSEMBLE (WEIGHTE~ Test <tibble [556 x ~ <lgl [1]>
Plot the chart:
model_fit_ensemble_resample_bivariada %>%
plot_modeltime_resamples(.interactive = TRUE)
Build the out-of-sample projection, from the last observation, for the next 24 months:
refit_tbl_bivariada <- calibration_tbl_bivariada %>%
modeltime_refit(data = CORN)
refit_tbl_bivariada %>%
modeltime_forecast(h = "2 years", actual_data = CORN) %>%
plot_modeltime_forecast(
.legend_max_width = 25, # For mobile screens
.interactive = TRUE
)
[to be continued…]
Business Science, in R-bloggers. Introducing Modeltime Recursive: Tidy Autoregressive Forecasting with Lags. Available at: https://www.r-bloggers.com/2021/04/introducing-modeltime-recursive-tidy-autoregressive-forecasting-with-lags/
Business Science, in R-bloggers. Introducing Modeltime: Tidy Time Series Forecasting using Tidymodels. Available at: https://www.r-bloggers.com/2020/06/introducing-modeltime-tidy-time-series-forecasting-using-tidymodels/
Business Science, GitHub. Ensemble Algorithms for Time Series Forecasting with Modeltime. Available at: https://business-science.github.io/modeltime.ensemble/ Accessed June 2021.
CRAN. Getting Started with Modeltime. Available at: https://cran.r-project.org/web/packages/modeltime/vignettes/getting-started-with-modeltime.html
CRAN. Package ‘modeltime’. Available at: https://cran.r-project.org/web/packages/modeltime/modeltime.pdf June 2021.
CRAN. Package ‘modeltime.ensemble’. Available at: https://cran.r-project.org/web/packages/modeltime.ensemble/modeltime.ensemble.pdf June 2021.
CRAN. Package ‘modeltime.resample’. Available at: https://cran.r-project.org/web/packages/modeltime.resample/modeltime.resample.pdf June 2021.
GitHub. Ensemble Algorithms for Time Series Forecasting with Modeltime. Available at: https://github.com/business-science/modeltime.ensemble Accessed June 2021.
HANSEN, B. Econometrics. University of Wisconsin, Department of Economics. Available at: https://www.ssc.wisc.edu/~bhansen/econometrics/Econometrics.pdf March 11, 2021.
USAI, D. Time Series Machine Learning Analysis and Demand Forecasting with H2O & TSstudio. Available at: https://rpubs.com/DiegoUsai/565288